Cheshire II at GeoCLEF: Fusion and Query Expansion for GIR

Author

  • Ray R. Larson
Abstract

In this paper I describe the Berkeley (group 1) approach to the GeoCLEF task for CLEF 2005. The main technique we are testing is the fusion of multiple probabilistic searches against different XML components, using both Logistic Regression (LR) algorithms and a version of the Okapi BM-25 algorithm. We also combine multiple translations of queries in cross-language searching. Since this is the first time that the Cheshire system has been used for CLEF, this approach can, at best, be considered a very preliminary baseline test of some retrieval algorithms and approaches. The primary geographically based approaches taken for GeoCLEF were to georeference proper nouns in the text using a gazetteer derived from the World Gazetteer, with both English and German names for each place, and to expand place names for regions or countries in the queries with the names of the countries or cities in those regions or countries.
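The two ideas named in the abstract, expanding region names in a query and fusing the results of several searches, can be illustrated with a minimal sketch. This is not the Cheshire II implementation: the gazetteer contents, the function names `expand_query`, `min_max` and `fuse`, and the fusion rule (min-max normalization followed by summing scores, often called CombSUM) are all assumptions chosen for illustration, since the abstract does not state how scores are combined.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer (an assumption, not the World Gazetteer data):
# region or country name -> names of places it contains.
TOY_GAZETTEER: Dict[str, List[str]] = {
    "scandinavia": ["Denmark", "Norway", "Sweden"],
    "benelux": ["Belgium", "Netherlands", "Luxembourg"],
}


def expand_query(query: str, gazetteer: Dict[str, List[str]]) -> str:
    """Append the contained place names for each region term in the query."""
    extra: List[str] = []
    for token in query.split():
        key = token.strip(".,;:!?").lower()
        extra.extend(gazetteer.get(key, []))
    return query if not extra else query + " " + " ".join(extra)


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one run's scores to [0, 1] so different runs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(*runs: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sum normalized scores across runs (CombSUM, an assumed fusion rule)."""
    fused: Dict[str, float] = {}
    for run in runs:
        for doc, score in min_max(run).items():
            fused[doc] = fused.get(doc, 0.0) + score
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

In this sketch, `expand_query("fishing in Scandinavia", TOY_GAZETTEER)` appends the three country names to the query, and `fuse` could take, for example, one score dictionary from an LR-ranked search and one from a BM25-ranked search over the same collection.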

Similar resources

Cheshire at GeoCLEF 2008: Text and Fusion Approaches for GIR

In this paper we will briefly describe the approaches taken by Berkeley for the main GeoCLEF 2008 tasks (Mono and Bilingual retrieval). The approach this year used probabilistic text retrieval based on logistic regression and incorporating blind relevance feedback for all of the runs and in addition we ran a number of tests combining this type of search with OKAPI BM25 searches using a fusion a...

Re-Ranking for Geo-Relevance With Non-Contextual Heuristics at GeoCLEF 2007

Geographic Information Retrieval (GIR) is an attempt to improve relevance by taking geographic information in textual documents into account. We describe our experiments carried out at the GeoCLEF 2007 evaluation [1] that investigate further the role of geo-filtering based re-ranking and query expansion with geographic terms. Our main findings are that manual query expansion with geo-terms is m...

Using Geographic Signatures as Query and Document Scopes in Geographic IR

This paper reports the participation of the University of Lisbon at the 2007 GeoCLEF task. We adopted a novel approach for GIR, focused on handling geographic features and feature types on both queries and documents, generating signatures with multiple geographic concepts as a scope of interest. We experimented with new query expansion and text mining strategies, relevance feedback approaches and ra...

The University of Lisbon at GeoCLEF 2007

This paper reports the participation of the XLDB Group from the University of Lisbon at the 2007 GeoCLEF task. We adopted a novel approach for GIR, focused on handling geographic features and feature types on both queries and documents, generating geographic signatures with multiple geographic concepts as a scope of interest. We experimented with new query expansion and text mining strategies, relev...

SINAI-GIR System. University of Jaén at GeoCLEF 2008

This paper describes the third participation of the SINAI research group from the University of Jaén in the GeoCLEF track. We have tried to improve the system proposed last year in GeoCLEF 2007. The main developments are related to the use of query reformulation, keyword recognition, hyponym extraction and query geo-expansion. On the other hand, new rules have been applied in the Validator subsystem ...


Publication date: 2005